@thevilledev thevilledev commented Jan 6, 2026

Motivation

In #897, count comparisons like count(users, .active) >= 1 were optimized by using the any builtin. Expressions like count(users, .active) > 100, however, currently iterate through the entire array even when the 101st match is found early. For large arrays where the threshold is reached quickly, this wastes both CPU and memory.

This optimization enables early termination: once the count reaches the required threshold, the loop exits immediately. It is a bytecode-level approach to optimizing count comparisons that avoids introducing new language builtins, which would bloat the stdlib.

Changes

The BuiltinNode AST node now has a new Threshold field, which carries information from the optimizer phase to the compiler phase. The new countThreshold optimizer detects count comparison patterns and calculates the threshold:

  • count(arr, pred) > N -> threshold = N + 1 (exit proves > N is true)
  • count(arr, pred) >= N -> threshold = N (exit proves >= N is true)
  • count(arr, pred) < N -> threshold = N (exit proves < N is false)
  • count(arr, pred) <= N -> threshold = N + 1 (exit proves <= N is false)
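The mapping above can be sketched in plain Go. This is an illustration only, not the optimizer code from this PR: thresholdFor and compare are hypothetical helper names, and operators are passed as strings for simplicity. The sketch also shows why stopping at the threshold is safe: the original comparison gives the same answer for a count that stopped at the threshold as for any larger full count.

```go
package main

import "fmt"

// thresholdFor returns the count at which the loop may stop for a comparison
// of the form count(arr, pred) <op> n. The boolean is false for operators
// this sketch does not cover. (Hypothetical helper, not the PR's code.)
func thresholdFor(op string, n int) (int, bool) {
	switch op {
	case ">", "<=":
		return n + 1, true
	case ">=", "<":
		return n, true
	}
	return 0, false
}

// compare applies the original comparison to a count.
func compare(op string, count, n int) bool {
	switch op {
	case ">":
		return count > n
	case ">=":
		return count >= n
	case "<":
		return count < n
	case "<=":
		return count <= n
	}
	return false
}

func main() {
	const n = 100
	for _, op := range []string{">", ">=", "<", "<="} {
		t, _ := thresholdFor(op, n)
		// A count that stopped at the threshold decides the comparison the
		// same way any larger full count would.
		fmt.Printf("count %s %d: threshold=%d, result at threshold=%v\n",
			op, n, t, compare(op, t, n))
	}
}
```

For > and >= the result at the threshold is true, and for < and <= it is false, matching the "exit proves" notes in the list.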

The compiler's count builtin handler now emits early-termination bytecode when a threshold is set.
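The effect of the emitted bytecode can be approximated by an ordinary Go loop. The following is a rough sketch of the semantics, not the actual VM code: countUpTo is a hypothetical name, and using 0 to mean "no threshold" is an assumption made for this example only. It requires Go 1.18 or later for generics.

```go
package main

import "fmt"

// countUpTo counts the elements matching pred and stops as soon as the count
// reaches threshold. A threshold of 0 means "no threshold" (an assumption of
// this sketch). It also reports how many elements were visited.
func countUpTo[T any](items []T, pred func(T) bool, threshold int) (count, visited int) {
	for _, item := range items {
		visited++
		if pred(item) {
			count++
			// Check the threshold right after each increment and leave the
			// loop once it has been reached.
			if threshold > 0 && count >= threshold {
				break
			}
		}
	}
	return count, visited
}

func main() {
	items := make([]int, 10000)
	for i := range items {
		items[i] = i
	}
	isEven := func(v int) bool { return v%2 == 0 }

	// count(items, isEven) > 100 needs threshold 100 + 1.
	count, visited := countUpTo(items, isEven, 101)
	fmt.Println(count, visited, count > 100) // 101 201 true

	// Without a threshold the whole slice is scanned.
	full, all := countUpTo(items, isEven, 0)
	fmt.Println(full, all) // 5000 10000
}
```

With the threshold, the loop visits 201 of the 10,000 elements; when the threshold is never reached, it visits all of them, as in the NoEarlyExit benchmarks.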

Benchmark run:

go test ./optimizer/... -bench='BenchmarkCountThreshold' -run=^$ -benchmem -count=10

Results against master:

cpu: Apple M1 Pro
                                │    old.txt    │               new.txt               │
                                │    sec/op     │   sec/op     vs base                │
CountThresholdEarlyMatch-8        393.34µ ±  2%   20.15µ ± 7%  -94.88% (p=0.000 n=10)
CountThresholdGteEarlyMatch-8     401.33µ ± 12%   17.28µ ± 2%  -95.70% (p=0.000 n=10)
CountThresholdNoEarlyExit-8        357.1µ ±  3%   361.2µ ± 2%   +1.15% (p=0.043 n=10)
CountThresholdLargeEarlyMatch-8   391.58µ ±  5%   83.22µ ± 3%  -78.75% (p=0.000 n=10)
CountThresholdLtEarlyExit-8       411.16µ ± 42%   19.80µ ± 1%  -95.18% (p=0.000 n=10)
CountThresholdLteEarlyExit-8      397.49µ ±  2%   17.24µ ± 3%  -95.66% (p=0.000 n=10)
CountThresholdLtNoEarlyExit-8      365.2µ ± 13%   363.1µ ± 1%        ~ (p=0.529 n=10)
CountThresholdLteNoEarlyExit-8     363.0µ ±  5%   361.3µ ± 1%        ~ (p=0.315 n=10)
geomean                            384.6µ         68.21µ       -82.26%

                                │    old.txt    │                new.txt                 │
                                │     B/op      │     B/op      vs base                  │
CountThresholdEarlyMatch-8        158.26Ki ± 0%   80.92Ki ± 0%  -48.87% (p=0.000 n=10)
CountThresholdGteEarlyMatch-8     158.26Ki ± 0%   80.52Ki ± 0%  -49.12% (p=0.000 n=10)
CountThresholdNoEarlyExit-8        158.3Ki ± 0%   158.3Ki ± 0%        ~ (p=1.000 n=10) ¹
CountThresholdLargeEarlyMatch-8    158.3Ki ± 0%   101.6Ki ± 0%  -35.81% (p=0.000 n=10)
CountThresholdLtEarlyExit-8       158.26Ki ± 0%   80.91Ki ± 0%  -48.88% (p=0.000 n=10)
CountThresholdLteEarlyExit-8      158.26Ki ± 0%   80.52Ki ± 0%  -49.12% (p=0.000 n=10)
CountThresholdLtNoEarlyExit-8      158.3Ki ± 0%   158.3Ki ± 0%        ~ (p=0.474 n=10)
CountThresholdLteNoEarlyExit-8     158.3Ki ± 0%   158.3Ki ± 0%        ~ (p=1.000 n=10)
geomean                            158.3Ki        106.9Ki       -32.43%
¹ all samples are equal

                                │    old.txt    │                new.txt                │
                                │   allocs/op   │  allocs/op   vs base                  │
CountThresholdEarlyMatch-8         10006.0 ± 0%    106.0 ± 0%  -98.94% (p=0.000 n=10)
CountThresholdGteEarlyMatch-8     10006.00 ± 0%    55.00 ± 0%  -99.45% (p=0.000 n=10)
CountThresholdNoEarlyExit-8         10.01k ± 0%   10.01k ± 0%        ~ (p=1.000 n=10) ¹
CountThresholdLargeEarlyMatch-8    10.006k ± 0%   2.751k ± 0%  -72.51% (p=0.000 n=10)
CountThresholdLtEarlyExit-8        10006.0 ± 0%    105.0 ± 0%  -98.95% (p=0.000 n=10)
CountThresholdLteEarlyExit-8      10006.00 ± 0%    56.00 ± 0%  -99.44% (p=0.000 n=10)
CountThresholdLtNoEarlyExit-8       10.01k ± 0%   10.01k ± 0%        ~ (p=1.000 n=10) ¹
CountThresholdLteNoEarlyExit-8      10.01k ± 0%   10.01k ± 0%        ~ (p=1.000 n=10) ¹
geomean                             10.01k         744.6       -92.56%
¹ all samples are equal

Further comments

  • This follows the same pattern as BuiltinNode.Map, which the filterMap optimizer uses to exchange information between the compiler and the optimizer phases.
  • We add some bytecode overhead - essentially 4 extra opcodes when threshold is set.
  • The countAny optimizer still remains in use for > 0 and >= 1 scenarios. It runs before this new countThreshold optimizer.
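  • Previously the complexity was always O(n), where n is the array length. With this change it is O(k), where k is the position of the element at which the match count reaches the threshold. When the threshold is never reached, it remains O(n).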

Optimize the following patterns:

- count(arr, pred) > N
- count(arr, pred) >= N
- count(arr, pred) < N
- count(arr, pred) <= N

Add a threshold check inside the count loop. When the count reaches
the threshold, the loop exits early instead of scanning the entire array.

This is implemented via a new Threshold field on BuiltinNode that the
optimizer sets when detecting these patterns. The compiler then emits
bytecode that checks the count against the threshold after each increment
and jumps out of the loop when reached.

Signed-off-by: Ville Vesilehto <[email protected]>
@thevilledev thevilledev changed the title from "WIP perf(optimizer): add count threshold comparisons" to "perf(optimizer): add count threshold comparisons" Jan 12, 2026
@thevilledev thevilledev marked this pull request as ready for review January 12, 2026 20:20
@antonmedv

Very nice optimization! It will be very useful.

@antonmedv antonmedv merged commit 4ff281d into expr-lang:master Jan 12, 2026
18 checks passed